A sequential algorithm for fast clique percolation 
In complex network research, clique percolation, introduced by Palla et al., is a deterministic
community detection method that allows for overlapping communities and is based purely on
local topological properties of a network. Here we present a sequential clique percolation algorithm
(SCP) for fast community detection in weighted and unweighted networks, for cliques of a chosen
size. The method is based on sequentially inserting the constituent links into the network while
simultaneously keeping track of the emerging community structure. Unlike existing algorithms, the
SCP method detects k-clique communities at multiple weight thresholds in a single run
and can simultaneously produce a dendrogram representation of the hierarchical community structure.
In sparse weighted networks, the SCP algorithm can also be used to implement the weighted
clique percolation method recently introduced by Farkas et al. The computational time of the SCP
algorithm scales linearly with the number of k-cliques in the network. As an example, the method
is applied to a product association network, revealing its nested community structure.

I. INTRODUCTION



Over the last decade, complex networks have become a standard framework in the study of
complex systems [1]. The simplicity of the network representation, where the interactions and
interacting elements are mapped to links and nodes, respectively, facilitates its use on a
number of systems, ranging from human societies to biological systems. One prominent feature
of complex networks is related to their mesoscopic properties. Networks often display modular
structure, i.e., they are structured in terms of modules or communities, which are, in general,
sets of densely interconnected nodes. Such communities are often closely related to functional
units of the system, for example groups of individuals interacting with each other in
society [2, 3, 4, 5, 6] or functional modules in metabolic networks [7, 8, 9].

The problem of detecting communities in complex networks has received a lot of attention in
recent years. The problem is twofold. First, there is no unique way to rigorously define what
constitutes a community. For any definition, several choices have to be made: whether
communities are defined using local or global network properties, whether nodes can
participate in several communities, and whether the definition allows for weighted networks
and a nested hierarchy of communities. Second, any definition is useful in practice only if it
can be reformulated as an algorithm that scales well enough to process large networks. As a
result, a large number of community definitions and their algorithmic implementations have
been proposed in recent years [13 [TTl, [H [H, [11 [H ; for a review see [11].

In this paper we focus on a fast algorithmic implementation of the clique percolation (CP)
method, originally introduced by Palla et al. [^. The CP method is deterministic and based
solely on local topological properties, defining a k-clique community as a set of nodes
belonging to adjacent k-cliques. This allows for overlapping communities, i.e., nodes having
multiple community memberships. The CP method has earlier been successfully applied to
various community detection problems: detection of protein communities related to cancer
metastasis [17], analysis of communities in co-authorship, word-association and
protein-interaction networks [^, and time evolution of social groups [^. In contrast to
existing implementations [18], which detect k-clique communities for all values of k by first
finding the maximal cliques with an exponentially scaling algorithm [19], we focus on rapid
detection of communities for a chosen value of k. Our sequential clique percolation (SCP)
algorithm is based on sequentially inserting links into the network and keeping track of the
emerging community structure. It has been designed specifically for weighted networks
containing hierarchical communities that are reflected in the link weights. When links are
inserted in decreasing order of weight, the algorithm detects k-clique communities at chosen
threshold levels in a single run and simultaneously produces a dendrogram representation of
the hierarchical community structure. In addition, the algorithm can be used for very fast
community detection in unweighted networks.

This paper is structured as follows. First, we present our algorithm for the simplest,
unweighted case and discuss its scaling properties. We then move on to detecting nested
communities in weighted networks, applying the algorithm to a product association network
generated from data on sellers and products on an online auction site. Finally, we discuss a
variation of the algorithm that is based on ordering k-cliques according to their weighted
properties, and present our conclusions.

II. THE SCP ALGORITHM 

Let us begin by defining k-cliques and k-clique communities [gl. [Toj:

• A k-clique is a set of k nodes which are all connected to each other. A k-clique community,
or k-community, is a set of nodes which can be reached by a series of overlapping k-cliques,
where overlap means that the k-cliques share k - 1 nodes.
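
To make the definition concrete, the following minimal Python sketch applies it literally: it enumerates every k-clique by brute force and merges cliques that share k - 1 nodes. It is meant only as a reference for small examples and is not the SCP algorithm; the function name `k_clique_communities_bruteforce` and the edge-list input format are assumptions made for illustration.

```python
from itertools import combinations


def k_clique_communities_bruteforce(edges, k):
    """Return k-clique communities as sorted node lists (brute force).

    `edges` is an iterable of (u, v) pairs; this input format is an
    assumption of the sketch, not something prescribed by the paper.
    """
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    # Every k-subset of nodes whose members are pairwise linked is a k-clique.
    cliques = [frozenset(c) for c in combinations(sorted(adj), k)
               if all(b in adj[a] for a, b in combinations(c, 2))]

    # Disjoint-set forest over clique indices.
    parent = list(range(len(cliques)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    # Two k-cliques overlap when they share exactly k - 1 nodes.
    for i, j in combinations(range(len(cliques)), 2):
        if len(cliques[i] & cliques[j]) == k - 1:
            parent[find(i)] = find(j)

    groups = {}
    for i, clique in enumerate(cliques):
        groups.setdefault(find(i), set()).update(clique)
    return sorted(sorted(g) for g in groups.values())
```

For two triangles sharing a link, plus a third triangle attached through a single node, the shared node ends up in both communities, which illustrates the overlap the definition allows.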

It should be noted that 2-cliques correspond to pairs of nodes connected by single links and
1-cliques to single nodes. Given a network $\Gamma$, the goal is then to find the k-communities
defined as above. In our case, we restrict ourselves to some specific values of k. Usually
choosing k = 3 or k = 4 yields useful information, and to our knowledge these values of k have
so far yielded the most relevant communities in practical applications [13, [20j . Our
algorithm is based on detecting and storing k-communities as they emerge and consolidate when
links are sequentially inserted into the network. One can think of the process as first
"removing" each link $l$ from the network $\Gamma$, and then inserting the links back one by
one. For unweighted networks, the links can be inserted in any order, whereas for weighted
networks, it may be desirable to sort the links by weight.

Our algorithm for detecting k-communities consists of two phases. The first phase detects the
k-cliques which form when a link is inserted. These are then fed to the second phase, which
keeps track of the formation and merging of k-communities by processing the k-cliques found.
The two phases are described in detail below.



A. Phase I: Detecting the k-cliques

The first phase of the algorithm detects the k-cliques which are formed when a link is
inserted into the network. Suppose that the inserted link connects nodes $v_i$ and $v_j$ (see
Fig. 1). The minimum requirement for a new k-clique to form is that nodes $v_i$ and $v_j$ both
have degree of at least k - 1. If this is the case, the algorithm proceeds by collecting all
nodes that are neighbors of both nodes, $\mathcal{N}_{ij} = \mathcal{N}_i \cap \mathcal{N}_j$,
where $\mathcal{N}$ denotes neighborhood. Now, when the link $l_{ij}$ is added, each
(k - 2)-clique contained in the set $\mathcal{N}_{ij}$ gives rise to a new k-clique. Therefore,
all newly formed k-cliques are found by detecting all the (k - 2)-cliques in
$\mathcal{N}_{ij}$. For commonly used small clique sizes, this is very fast: for k = 3, the
(k - 2)-cliques are single nodes, while for k = 4, every connected pair of nodes in
$\mathcal{N}_{ij}$ gives rise to a new 4-clique.
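
The clique-detection step can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions rather than the authors' implementation: the network is held as a dictionary `adj` mapping each node to its set of neighbors, the function name `new_k_cliques` is hypothetical, and the (k - 2)-cliques among the common neighbors are found by plain enumeration, which is adequate for the small k discussed here.

```python
from itertools import combinations


def new_k_cliques(adj, vi, vj, k):
    """Return the k-cliques completed by inserting the link (vi, vj).

    `adj` maps node -> set of neighbours and must NOT yet contain the
    new link; the caller adds it after this function returns.
    """
    # After the insertion both end nodes need degree of at least k - 1.
    if len(adj.get(vi, ())) + 1 < k - 1 or len(adj.get(vj, ())) + 1 < k - 1:
        return []

    # Common neighbourhood of the two end nodes.
    common = adj.get(vi, set()) & adj.get(vj, set())

    found = []
    # Each (k - 2)-clique among the common neighbours yields one new k-clique.
    for sub in combinations(sorted(common), k - 2):
        if all(b in adj[a] for a, b in combinations(sub, 2)):
            found.append(frozenset(sub) | {vi, vj})
    return found
```

With four common neighbors of which three are mutually connected, the function reports four new 3-cliques, three new 4-cliques and a single new 5-clique.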




FIG. 1: Schematic illustration of the process for detecting the k-cliques that a newly
inserted link completes. The dashed line depicts the new link, inserted between nodes $v_i$
and $v_j$. The common neighbors of nodes $v_i$ and $v_j$ are
$\mathcal{N}_{ij} = \{v_m, v_n, v_p, v_q\}$. For detecting newly formed 4-cliques, all pairs of
nodes in $\mathcal{N}_{ij}$ are checked to see if they are connected, that is, if they form a
2-clique. Each 2-clique in the set gives rise to a 4-clique, so in total the link $l_{ij}$
generates three 4-cliques. In the case k = 5, only one 3-clique is found, which contains the
nodes $v_m$, $v_n$ and $v_p$. It gives rise to a single 5-clique including these nodes in
addition to $v_i$ and $v_j$.

Next, the k-cliques detected as above are fed one by one into the second phase of the
algorithm.

B. Phase II: Detecting the k-communities

The second phase of the algorithm detects and keeps track of k-communities which form and merge
when new k-cliques are input from the first phase. Because a k-community is defined as a set
of nodes which can all be reached by a series of overlapping k-cliques, the crucial issue here
is the efficient detection of overlap between k-cliques. A naive approach would be to search
for shared sets of k - 1 nodes between the newly input clique and all existing cliques.
However, the required computational effort makes this approach impractical. Instead, we take
advantage of the sequential nature of the process by "locally" detecting possible overlap of
each new k-clique with existing k-communities and by updating the community structure
accordingly.

Let us begin by noting that the k-community structure of a network can be represented by a
bipartite network, where the two types of nodes represent k-cliques and (k - 1)-cliques. In
this network, a link exists between two nodes of different type if the k-clique contains the
(k - 1)-clique as a sub-clique. This is illustrated in Fig. 2. The usefulness of this
representation becomes apparent in the following: each connected component in this bipartite
network corresponds to a k-clique community, because by definition k-cliques belonging to the
same community are connected through shared (k - 1)-cliques. Furthermore, connected components
of the unipartite projections of the bipartite network [30] similarly correspond to k-clique
communities. In the following, we focus on the (k - 1)-clique projection of this bipartite
network. We denote the network resulting from this projection by $\Gamma^*$. In this
unipartite network, nodes $v^*$ represent the (k - 1)-cliques of $\Gamma$, and links $l^*$
exist between nodes which are sub-cliques of the same k-clique.

FIG. 2: Illustration of the algorithm for detecting k-clique communities in a simple example
network. Here, k = 3. a) The original network $\Gamma$ consists of three 3-cliques labeled A,
B, and C. 2-cliques, i.e., nodes connected by single links, are labeled with lower case
letters. b) Bipartite network presentation of the clique structure. Note that in the bipartite
network, the 3-cliques B and C, which form a 3-clique community, are connected by a shared
2-clique. Clique A forms another 3-clique community. c) 3-cliques detected by the first phase
of the algorithm as links are sequentially inserted into the network. Each new k-clique is
denoted by red nodes, whereas nodes associated with existing k-cliques appear gray.
d) Corresponding updates to the (k - 1)-clique network $\Gamma^*$ as a result of the second
phase of the algorithm. k-clique communities correspond to connected components of this
network (shaded areas).

For the sake of clarity, we will first present a "physical" interpretation of Phase II of the
algorithm, and then discuss the algorithmic implementation where certain shortcuts can be
made. Similarly to Phase I, where the original network $\Gamma$ is reconstructed link by link,
Phase II of the SCP algorithm sequentially builds up $\Gamma^*$ from the k-cliques brought
forward from Phase I. At the same time, it keeps track of the connected components of
$\Gamma^*$ (see Fig. 2, panels c and d). These correspond to k-clique communities. When a new
k-clique is input from Phase I, its constituent (k - 1)-cliques are first extracted; obviously
there are always k such sub-cliques. Each of these (k - 1)-cliques corresponds to a node in
$\Gamma^*$. Some of these nodes may already be present, if the corresponding (k - 1)-cliques
have been handled earlier as part of another k-clique; if not, they are created at this stage.
Finally, links are created between members of this set of k nodes, and the resulting changes
in the connected component structure of $\Gamma^*$ are recorded.

In the algorithmic implementation, things can be done somewhat more efficiently, resembling
techniques used in link percolation. The actual network $\Gamma^*$ does not need to be
constructed, as it is enough to keep track of its connected components, i.e., the component
indices of its nodes $v^*$. This is equivalent to link percolation in $\Gamma^*$, which can be
implemented for example with disjoint-set forests [21]. At this stage it is enough to ensure
that all (k - 1)-clique nodes corresponding to the new k-clique are marked as belonging to the
same component (the new (k - 1)-cliques and their links may either form a new connected
component, merge with an existing component, or join together at most k existing components).

The above process is then repeated for each k-clique input from Phase I. Finally, once all
links have been inserted (Phase I) and the subsequently formed k-cliques handled (Phase II),
the k-communities of the original network $\Gamma$ can be read from the component indices of
$v^*$, assigning nodes of $\Gamma$ to their corresponding communities.

In theory, it would also be possible to keep track of the connected components of the whole
bipartite network, or alternatively to project the bipartite network to k-cliques instead of
(k - 1)-cliques. Both representations contain the same connected components and would thus
yield the same k-clique communities. However, the former alternative is unnecessarily
complicated as it involves nodes of two types. The latter implementation is not as
computationally effective as the current choice in cases where a newly inserted k-clique
overlaps with a large number of existing k-cliques.
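
The bookkeeping of Phase II can be illustrated with a small disjoint-set forest whose elements are (k - 1)-cliques. The sketch below is one possible realization under stated assumptions, not the authors' code: the class name `CliqueCommunities` and its methods are hypothetical, each (k - 1)-clique is stored as a `frozenset` of node labels, and node labels are assumed to be sortable.

```python
from itertools import combinations


class CliqueCommunities:
    """Track k-clique communities as components of (k - 1)-cliques."""

    def __init__(self, k):
        self.k = k
        self.parent = {}  # (k-1)-clique -> parent (k-1)-clique

    def _find(self, x):
        # A (k-1)-clique seen for the first time becomes its own component.
        self.parent.setdefault(x, x)
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def add_clique(self, clique):
        """Process one k-clique: put its k sub-cliques in one component."""
        subs = [frozenset(s)
                for s in combinations(sorted(clique), self.k - 1)]
        root = self._find(subs[0])
        for s in subs[1:]:
            self.parent[self._find(s)] = root

    def communities(self):
        """Read the communities off the component structure."""
        groups = {}
        for sub in list(self.parent):
            groups.setdefault(self._find(sub), set()).update(sub)
        return sorted(sorted(g) for g in groups.values())
```

Only component membership is stored; the links between (k - 1)-cliques are never materialized, which mirrors the shortcut described in the text.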



C. Scaling of the algorithm 

Let us next discuss the performance of the SCP algorithm, before moving on to its application
to weighted network analysis. Obviously, the computational time required to process a network
depends on its properties; here, we wish to investigate the performance as a function of
network size and the number of k-cliques contained in the network. To do this, we have applied
the SCP algorithm to three types of networks with adjustable sizes. The first test case (GN),
introduced by Girvan and Newman, contains built-in communities and has often been used for
similar purposes. The GN networks used here consist of groups of 32 nodes, where each node has
on average 12 links to nodes of the same group and 4 links to other groups. The network size
$N$ is varied by changing the number of such groups. The second type of networks (WSN) is
generated using a recently published model of weighted social networks with communities [22],
using parameter values similar to the original reference. As the third type, we have used
co-authorship networks based on the cond-mat archive (CM), constructed similarly to e.g. [23].
However, in order to vary the network size, we have used time windows of varying length, such
that two authors are connected if they have published a joint paper during the time window. It
should be noted that although the WSN networks are inherently weighted, and the CM networks
can also be considered such, here we consider binary versions of both types for the
performance analysis.

FIG. 3: Computation time of the algorithm for three values of k, as a function of the number
of k-cliques (upper row) and network size (lower row). Symbols denote the different test
networks GN, WSN, and CM; see text for details. The solid line is a linear reference. For
comparison, we have also plotted the computational time of the CFinder 1.21 algorithm for the
GN networks. Note that CFinder always processes all values of k.

Results in Fig. 3 show that the computational time of the SCP algorithm grows practically
linearly as a function of the number of k-cliques for all networks. This is as expected,
because the computational time of the algorithm is dominated by the process of detecting
k-cliques and processing them for overlap, such that each k-clique is processed exactly twice.
This is also reflected in the network size dependence of the required computational time for
both types of model networks (GN, WSN). For these networks the local structure remains
essentially unchanged as the network grows, and it appears that the number of k-cliques grows
linearly with $N$. However, for the CM networks, the computational time grows faster than
linearly as a function of network size. This is because the CM network is a projection of a
bipartite author-publication network containing large cliques that grow in size when $N$
increases. The problem is, as pointed out by Palla et al. Q, that the number of sub-cliques of
size $k$ within a clique of size $s$ is $\binom{s}{k}$. In the limit $s \gg k$ this leads to

$$\binom{s}{k} = \frac{s!}{k!\,(s-k)!} \approx \frac{s^k}{k!}. \tag{1}$$



Hence for large $s$, the number of k-cliques grows as the $k$th power of $s$, meaning that for
networks containing large cliques the SCP method performs best for rather small values of k.
For example, when k > 10 the analysis of the largest CM networks becomes extremely slow with
the SCP method. However, when very large cliques are not abundant in the network under
investigation, the SCP algorithm is very fast even for large networks. For example, detecting
4-clique communities in a mobile phone call network with approximately 4 million nodes and
6 million links [24] takes approximately one minute on a standard desktop computer. Thus, for
networks where cliques are on average fairly small, the main practical limitation of this
algorithm seems to be its memory consumption, as it requires keeping all (k - 1)-cliques of
the network in memory.
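
The growth of the number of k-cliques inside a single large clique can be checked numerically. The short Python sketch below compares the exact binomial count with the large-$s$ approximation $s^k / k!$; the function names and the example values s = 100 and k = 4 are chosen purely for illustration.

```python
import math


def subclique_count(s, k):
    """Exact number of k-cliques contained in one clique of size s."""
    return math.comb(s, k)


def subclique_estimate(s, k):
    """Large-s approximation s**k / k!, valid when s is much larger than k."""
    return s ** k / math.factorial(k)


# Example values picked for illustration only.
exact = subclique_count(100, 4)
estimate = subclique_estimate(100, 4)
ratio = exact / estimate  # approaches 1 as s grows at fixed k
```

A single clique of 100 nodes already contains 3921225 distinct 4-cliques, each of which the algorithm has to process individually.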

Finally, let us compare the performance of the SCP algorithm and the existing method
(CFinder 1.21, Q). Evidently, this comparison is somewhat complicated, as CFinder
simultaneously processes all clique sizes, whereas the SCP algorithm is by construction
limited to a single value of k. Nevertheless, summing up the processing times for all values
of k, we have observed that for the GN network, the processing time of the SCP algorithm
scales linearly with network size, whereas that of CFinder 1.21 appears to scale as a higher
power of $N$ (see Fig. 3). However, for denser networks, such as the CM network, the
comparison becomes somewhat meaningless as both methods become extraordinarily slow. This is
due to the very large number of k-cliques, as discussed above. It should be noted here that
the unpublished beta version, CFinder 2.0b, appears to scale far better than CFinder 1.21 and
seems to be able to deal with very large cliques. However, the key strength of the SCP
algorithm is its speed in weighted network analysis: it is able to process multiple weight
thresholds in a single run (see Section III A below). With the earlier method, this quickly
becomes infeasible, as the networks corresponding to each threshold have to be separately
input and analyzed. Thus, even if the processing time of both methods were exactly the same
for a single network, obtaining the k-community structure for 100 weight thresholds would be
100 times faster with the SCP algorithm. Another important difference is the inherent ability
of the SCP method to produce a dendrogram of nested k-communities; this feature does not exist
in earlier implementations (again, see Section III A below).



III. SCP FOR WEIGHTED NETWORKS 
A. Thresholding and nested communities 

Let us move on to weighted networks, where the concept of community structure becomes somewhat
more complicated. Only in the very simplest cases, where the networks are sparse, can the
weights be disregarded such that communities are associated with the pure topology of the
network. However, this is usually not feasible, as weighted networks can be rather dense, even
to such an extent that the topology no longer matters and any modular structure is encoded in
the link weights only. This is the case for example in stock interaction networks [25], whose
natural representation is a weight matrix with only nonzero elements.

For such networks, one is essentially left with two choices. The first is to threshold the
network, such that links whose weights are considered insignificantly small are removed, and
communities are detected in the resulting sparse network. It is evident that choosing the
right threshold is a non-trivial task; in fact, in many cases it may be better to take a
multi-resolution approach by investigating the resulting community structure for a range of
thresholds. The second option is to consider the weights directly when defining what
constitutes a community, and to apply a method which is based on this definition [HIH.

In the original formulation of the clique percolation algorithm, Palla et al. suggested a rule
for choosing a weight threshold $w^*$ for the network, such that the resulting k-community
structure would be as diverse as possible [oj . More specifically, $w^*$ is chosen such that
the largest community is twice the size of the second largest one, i.e., below the percolation
threshold where a giant k-clique community appears. For the original implementation, the
algorithm had to be run from the beginning for each threshold level. One of the benefits of
our approach is that it allows for obtaining the k-communities at any point of the process of
adding links, which is just thresholding done in reverse: if the links of the original network
$\Gamma$ are sorted and processed in descending order of weight, the algorithm yields for each
link the k-community structure of $\Gamma$ thresholded by the weight of the link. This is very
useful for selecting the threshold, as all threshold values can be processed in a single run.
Note that for dense networks, sweeping through the entire range of weights is not needed: the
algorithm can be stopped before (or immediately after) communities are entirely "smeared out"
by a giant community. Stopping the algorithm in time can greatly reduce the workload in dense
networks, as usually only a small fraction of all links need to be added before the
percolating component is found, after which adding more links does not increase the number of
nodes in the communities, but only makes the community denser in cliques.
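
The single-run thresholding can be sketched by combining both phases in one loop over the links sorted by descending weight. The following Python code is a compact illustration under stated assumptions and not the authors' implementation: the function name `scp_thresholds`, the `(u, v, weight)` triple format, and the convention that a threshold $w$ keeps all links with weight of at least $w$ are choices made for this sketch.

```python
from itertools import combinations


def scp_thresholds(weighted_edges, k, thresholds):
    """Map each threshold to the k-communities of the thresholded network.

    `weighted_edges` holds (u, v, weight) triples. A threshold w is taken
    to keep every link with weight >= w (an assumption of this sketch).
    """
    adj, parent = {}, {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def snapshot():
        groups = {}
        for sub in list(parent):
            groups.setdefault(find(sub), set()).update(sub)
        return sorted(sorted(g) for g in groups.values())

    results = {}
    pending = sorted(thresholds, reverse=True)
    ordered = sorted(weighted_edges, key=lambda e: e[2], reverse=True)
    for u, v, w in ordered:
        # All links at or above the next threshold are in: record the state.
        while pending and w < pending[0]:
            results[pending.pop(0)] = snapshot()
        # Phase I: (k - 2)-cliques among common neighbours give new k-cliques.
        common = adj.get(u, set()) & adj.get(v, set())
        for sub in combinations(sorted(common), k - 2):
            if all(b in adj[a] for a, b in combinations(sub, 2)):
                clique = sorted(set(sub) | {u, v})
                # Phase II: join the k sub-cliques into one component.
                faces = [frozenset(f) for f in combinations(clique, k - 1)]
                root = find(faces[0])
                for f in faces[1:]:
                    parent[find(f)] = root
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    for t in pending:  # thresholds at or below the smallest weight
        results[t] = snapshot()
    return results
```

Because the component structure only ever grows, each snapshot costs one pass over the stored (k - 1)-cliques, and the loop can be cut short as soon as the lowest threshold of interest has been recorded.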

However, by focusing on a single threshold weight, valuable information on the community
structure contained in the correlations between weights can be lost. Often, the modular
structure of networks is inherently hierarchical: denser and stronger communities are nested
inside weaker ones, which may further be embedded inside even weaker ones
[li (2^, l27i, |2|. It is then natural to investigate this nestedness by considering the
development of the community structure when the weight threshold is swept through the range of
interest. Evidently, this requires book-keeping of the emergence and merging of communities as
the threshold is progressively lowered. For the SCP algorithm, this book-keeping is built in:
all necessary information can be recorded directly in Phase II of the algorithm. In
particular, it is easy to store when a k-community appears, which nodes belong to it, how its
size grows as new k-cliques join it, and when it merges with other k-communities. It should be
stressed here that this is a genuine advantage: separately detecting the community structure
for each threshold and then tracking the formation and merging of communities would be very
difficult and time-consuming.
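
One way to realize this book-keeping is to let the disjoint-set structure of Phase II log an event every time a k-clique is processed. The Python sketch below is a hypothetical illustration: the class name `DendrogramRecorder`, the event labels "birth", "grow" and "merge", and the event tuple layout are assumptions, and the caller is expected to pass the weight at which each k-clique appeared.

```python
from itertools import combinations


class DendrogramRecorder:
    """Disjoint-set forest over (k - 1)-cliques that logs community events."""

    def __init__(self, k):
        self.k = k
        self.parent = {}   # (k-1)-clique -> parent (k-1)-clique
        self.nodes = {}    # component root -> network nodes of the community
        self.events = []   # (kind, weight, sorted community members)

    def find(self, x):
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def add_clique(self, clique, weight):
        faces = [frozenset(f)
                 for f in combinations(sorted(clique), self.k - 1)]
        # Communities already containing one of the sub-cliques.
        old_roots = list({self.find(f) for f in faces if f in self.parent})
        root = old_roots[0] if old_roots else faces[0]
        members = set(clique)
        for r in old_roots:
            members |= self.nodes.pop(r)
            self.parent[r] = root
        for f in faces:
            self.parent.setdefault(f, root)  # create unseen sub-cliques
        self.nodes[root] = members
        if not old_roots:
            kind = "birth"   # a new community appears
        elif len(old_roots) > 1:
            kind = "merge"   # two or more communities join
        else:
            kind = "grow"    # the clique attaches to one existing community
        self.events.append((kind, weight, sorted(members)))
```

The "birth" and "merge" events, ordered by weight, contain what is needed to draw the branches of a dendrogram; "grow" events record how the size of a community develops between merges.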

This information on the nested community structure is best visualized with a dendrogram, which
is a common presentation format in agglomerative community detection (see, e.g., [28]). In a
dendrogram, horizontal lines correspond to communities, and a branching of the lines denotes
communities merging. Choosing a single weight threshold would correspond to taking a vertical
slice of the dendrogram. Fig. 4 shows two examples of the nested community structure within a
product category network, for k = 3 and k = 4. This network is constructed from online trading
data, downloaded from the Finnish auction website Huuto.net. In this network, nodes correspond
to product categories (N = 345), and the weight of a link connecting two categories
corresponds to the number of individuals who have traded in both of them. This network is very
dense: the number of links is 52536, corresponding to a link density p = 0.89, and thus the
network can be considered a suitable test case for the evolution of community structure while
sweeping the threshold weight. In Fig. 4 the labels associated with each community describe
their dominant product categories. Although the dendrograms formed by using k = 3 and k = 4
are not identical, several similar communities appear for both values. From a common-sense
point of view, these appear natural: electronic devices and computer components merge into a
single community, as do music and movies, and children's and women's clothing.

FIG. 4: Dendrogram visualization of the nested k-community structure of the trading categories
of the Finnish online auction site Huuto.net for k = 3 (a) and k = 4 (b).

Often it is neither possible nor meaningful to include all k-communities in such a
visualization: the outcome would be too complicated to be interpreted by visual inspection.
The main problem is the numerous single k-cliques, which merge into larger k-communities. For
any analysis of the dendrogram structure the entire data should be used, but for visualization
purposes it is useful to threshold the dendrogram such that only k-communities which are
larger than a threshold size $N_{th}$ appear in the plot. In Fig. 4, k-communities of sizes
larger than k are displayed, i.e., $N_{th} = k$.



B. Weighted k-clique percolation

As pointed out above, considering the weights in the definition of what constitutes a
community is an alternative to simply discarding low-weight links. Such an extension for
clique percolation has recently been introduced by Farkas et al. in [20]. In this method, each
k-clique is assigned a "weight", which equals the intensity [1^ of its edge weights. The
intensity is defined as the geometric mean of the link weights in the k-clique. The community
structure is then obtained by choosing an intensity threshold $I^*$ and taking into account
only those k-cliques whose intensity is above $I^*$.
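
As a small worked example of the intensity, the following Python function computes the geometric mean of the link weights within a clique. The function name `clique_intensity` and the representation of weights as a dictionary keyed by `frozenset({u, v})` are assumptions of this sketch; all weights are assumed to be strictly positive.

```python
import math
from itertools import combinations


def clique_intensity(clique, weight):
    """Geometric mean of the link weights inside `clique`.

    `weight` maps frozenset({u, v}) -> positive link weight; this storage
    format is an assumption made for the example.
    """
    ws = [weight[frozenset(pair)] for pair in combinations(clique, 2)]
    # Averaging logarithms avoids overflow for cliques with many links.
    return math.exp(sum(math.log(w) for w in ws) / len(ws))
```

For a 3-clique with link weights 1, 2 and 4 the intensity is the cube root of 8, i.e. 2, which is lower than the arithmetic mean of the same weights.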

For our SCP algorithm, a simple modification allows for weighted clique percolation according
to the above scheme. To achieve this, instead of building the k-communities simultaneously as
the k-cliques emerge, all links are first inserted into the network and the resulting
k-cliques are stored. Then, the intensity of each of these k-cliques is calculated, and the
cliques are sorted with respect to their intensity. Finally, the sorted k-cliques are
processed one by one by the second phase of the algorithm, until the intensity threshold is
reached. Multiple thresholding levels are obtained as before, but now with respect to k-clique
intensities, and a dendrogram can be constructed similarly. Note that in addition to
intensity, any other measure describing the "weight" of the cliques can be used; e.g., if
homogeneous cliques are sought, one could also take the clique coherence [2^ into account.
Sorting cliques according to their intensities was briefly described by Farkas et al. in [20];
their construction appears somewhat similar to ours, as the intensity-sorted cliques are
handled in succession, and the method for obtaining overlapping k-communities seems to
correspond to building the whole bipartite network between k- and (k - 1)-cliques.
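
The intensity-ordered variant can be sketched as follows, assuming the k-cliques and their intensities have already been collected. This is a minimal illustration, not the authors' code: the function name `weighted_clique_percolation`, the use of node tuples as dictionary keys, and the strict comparison against the threshold (only cliques with intensity above it are used) are assumptions made here.

```python
from itertools import combinations


def weighted_clique_percolation(cliques, intensity, k, threshold):
    """Communities built from the k-cliques with intensity above `threshold`.

    `cliques` is an iterable of k-node tuples and `intensity` maps each
    tuple to its precomputed intensity (both formats are assumptions).
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    # Strongest cliques first, so stopping early applies the threshold.
    for clique in sorted(cliques, key=lambda c: intensity[c], reverse=True):
        if intensity[clique] <= threshold:
            break
        faces = [frozenset(f) for f in combinations(sorted(clique), k - 1)]
        root = find(faces[0])
        for f in faces[1:]:
            parent[find(f)] = root

    groups = {}
    for face in list(parent):
        groups.setdefault(find(face), set()).update(face)
    return sorted(sorted(g) for g in groups.values())
```

Lowering the threshold only ever adds cliques to the processed prefix of the sorted list, so a sweep over several intensity thresholds could reuse the same sorted order.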

The above procedure requires keeping all k-cliques in memory in addition to the
(k - 1)-cliques. In most cases the loss of speed is minimal, as the additional computational
load is related to the memory consumption and the sorting of cliques, which can be done in
log-linear time. However, a possible problem related to the SCP algorithm, and to the weighted
clique percolation method in general, is that all k-cliques have to be processed individually,
and their number can be very large in dense networks, as discussed in Section II C. When the
link weight thresholding procedure of Section III A is applied, this problem can be somewhat
circumvented by simply stopping the algorithm as soon as enough links have been inserted for
obtaining the community structure at the desired "resolution". However, for intensity-based
clique percolation this cannot be done, as all k-cliques have to be detected and sorted first.



IV. CONCLUSIONS 

We have introduced a sequential clique percolation algorithm for detecting k-clique
communities in a network by sequentially inserting its edges and keeping track of the emerging
community structure [31]. This algorithm has specifically been designed for (dense) weighted
networks, where weight-based thresholding of either the links or the cliques formed by them is
necessary for obtaining meaningful information on the structure. By applying the algorithm to
test networks, we have shown that the computational time required to process a network scales
linearly with the number of k-cliques in the network. The sequential nature of the algorithm
allows run-time construction of a dendrogram presentation of the nested hierarchical
k-community structure, which we have illustrated using a product category network.

The main tradeoff of our algorithm is that it detects the k-communities for a chosen value of
k with multiple weight thresholds in a single run, instead of obtaining k-communities for all
values of k with a single weight threshold as is done in the maximal clique algorithms. Hence
the SCP algorithm can be considered complementary to earlier presented solutions @ . Neither
of these algorithms can be argued to be strictly better or faster than the other, as their
performance depends heavily on the network topology and other aspects of the problem they are
solving. The SCP algorithm is particularly useful when a small clique size k is used and when
multiple weight threshold levels need to be studied, or when no prior knowledge of the proper
threshold level of a dense weighted network is at hand. The algorithm can also be considered a
reasonable choice for very large sparse networks, as suggested by the short computation time
needed to obtain the community structure of a mobile telephony network having millions of
nodes and links.